Source | # of sentences | Average logarithmic rank |
---|---|---|
http://www.aftonbladet.se/sportbladet/fotboll/landslaget/article1259575.ab | 15 | 4.27 |
http://www.aftonbladet.se/vss/nyheter/story/0,2789,909432,00.html | 13 | 4.47 |
http://www.aftonbladet.se/vss/rss/story/0,2789,1115135,00.html | 11 | 4.47 |
http://www.expressen.se/index.jsp?d=354&a=428813 | 12 | 4.49 |
http://www.gp.se/gp/jsp/Crosslink.jsp?d=128&a=236352&ref=rss | 14 | 4.49 |
http://www.expressen.se/index.jsp?d=354&a=407242 | 19 | 4.50 |
http://www.gp.se/gp/jsp/Crosslink.jsp?d=128&a=233253&ref=rss | 12 | 4.52 |
http://www.expressen.se/index.jsp?d=656&a=422587 | 12 | 4.52 |
http://www.aftonbladet.se/sportbladet/fotboll/article964837.ab | 13 | 4.52 |
http://www.expressen.se/index.jsp?a=403265 | 20 | 4.53 |
http://www.aftonbladet.se/vss/sport/story/0,2789,650002,00.html | 18 | 4.55 |
http://www.aftonbladet.se/vss/sport/story/0,2789,681290,00.html | 16 | 4.55 |
http://www.gp.se/gp/jsp/Crosslink.jsp?d=128&a=233074&ref=rss | 11 | 4.56 |
http://www.gp.se/gp/jsp/Crosslink.jsp?d=128&a=214157&ref=rss | 14 | 4.59 |
http://www.dn.se/DNet/jsp/polopoly.jsp?d=647&a=632147&rss=1403 | 20 | 4.60 |
http://www.expressen.se/index.jsp?d=354&a=402974 | 25 | 4.60 |
http://www.arbetarbladet.se/sports_local_1.php?id=328358&avdelning_1=102&avdelning_2=118&variabel=ISHvinj.gif | 12 | 4.60 |
http://www.arbetarbladet.se/article.php?id=377322&avdelning_1=102&avdelning_2=114 | 22 | 4.62 |
http://www.aftonbladet.se/vss/debatt/story/0,2789,689466,00.html | 21 | 4.62 |
http://www.arbetarbladet.se/sports_local_1.php?id=320673&avdelning_1=102&avdelning_2=115&variabel=FOTvinj.gif | 11 | 4.63 |
http://www.arbetarbladet.se/sports_local_1.php?id=329813&avdelning_1=102&avdelning_2=113&variabel=BANvinj.gif | 11 | 4.63 |
http://www.expressen.se/index.jsp?d=977&a=400395 | 20 | 4.64 |
http://www.gp.se/gp/jsp/Crosslink.jsp?d=119&a=233084&ref=rss | 12 | 4.65 |
http://www.expressen.se/index.jsp?a=309944 | 18 | 4.65 |
http://www.norran.se/sektion_c.php?id=498692&avdelning_1=101&avdelning_2=0 | 20 | 4.65 |
http://www.aftonbladet.se/vss/sport/story/0,2789,644997,00.html | 11 | 4.66 |
http://www.aftonbladet.se/vss/sport/story/0,2789,863716,00.html | 12 | 4.66 |
http://www.expressen.se/index.jsp?d=656&a=435469 | 12 | 4.66 |
http://www.gp.se/gp/jsp/Crosslink.jsp?d=119&a=233241&ref=rss | 11 | 4.67 |
http://www.aftonbladet.se/vss/sport/story/0,2789,640007,00.html | 15 | 4.67 |

Source | # of sentences | Average logarithmic rank |
---|---|---|
http://www.arbetarbladet.se/local_article.php?id=306734&avdelning_1=101&avdelning_2=101&variabel=GAVvinj.gif | 13 | 8.95 |
http://www.dn.se/DNet/jsp/polopoly.jsp?d=1058&a=630125&rss=1399 | 11 | 8.84 |
http://sydsvenskan.se/sport/article264171.ece | 11 | 8.60 |
http://www.arbetarbladet.se/local_article.php?id=317664&avdelning_1=0&avdelning_2=101&variabel=GAVvinj.gif | 17 | 8.40 |
http://www.ljp.se/20050708/artiklar/L1_20050708_012_12_3.htm | 42 | 8.33 |
http://sydsvenskan.se/sverige/article246537.ece | 29 | 8.32 |
http://www.svd.se/ego/140/http://www.svd.se/dynamiskt/noje/did_14313445.asp | 17 | 8.26 |
http://sydsvenskan.se/dygnetrunt/article266189.ece | 42 | 8.20 |
http://www.gp.se/gp/jsp/Crosslink.jsp?d=286&a=217790&ref=rss | 24 | 8.19 |
http://www.aftonbladet.se/vss/sport/story/0,2789,765400,00.html | 12 | 7.93 |
http://www.sva.se/dokument/stdmall.html?id=1261 | 11 | 7.86 |
http://sydsvenskan.se/malmo/article238251.ece | 13 | 7.76 |
http://www.aftonbladet.se/vss/halsa/story/0,2789,693397,00.html | 17 | 7.73 |
http://www.gotlandska.se/artikel-rss.asp?ID=19104 | 13 | 7.71 |
http://www.dn.se/DNet/jsp/polopoly.jsp?d=147&a=681507&rss=1400 | 13 | 7.65 |
http://www.expressen.se/index.jsp?d=2604&a=800436 | 27 | 7.61 |
http://www.gotlandska.se/artikel-rss.asp?ID=18907 | 12 | 7.59 |
http://www.aftonbladet.se/mode/article1325329.ab | 12 | 7.58 |
http://www.gotlandska.se/artikel-rss.asp?ID=20998 | 13 | 7.57 |
http://www.svd.se/dynamiskt/kultur/did_16268575.asp | 16 | 7.57 |
http://www.gotlandska.se/artikel-rss.asp?ID=21168 | 15 | 7.53 |
http://sydsvenskan.se/lund/article288546.ece | 20 | 7.51 |
http://www.arbetarbladet.se/local_article.php?id=562536&avdelning_1=102&avdelning_2=199&variabel=RIDvinj.gif | 12 | 7.49 |
http://www.aftonbladet.se/vss/halsa/story/0,2789,680541,00.html | 17 | 7.48 |
http://www.arbetarbladet.se/sports_local_1.php?id=328450&avdelning_1=102&avdelning_2=199&variabel=RIDvinj.gif | 13 | 7.45 |
http://www.dn.se/DNet/jsp/polopoly.jsp?d=647&a=729026&rss=1403 | 25 | 7.43 |
http://www.aftonbladet.se/vss/rss/story/0,2789,1105037,00.html | 38 | 7.41 |
http://www.aftonbladet.se/vss/matovin/story/0,2789,633911,00.html | 35 | 7.40 |
http://www.aftonbladet.se/vss/matovin/story/0,2789,678715,00.html | 14 | 7.39 |
http://www.aftonbladet.se/kultur/article1427386.ab | 12 | 7.39 |
In this subsection we replace average word length by the average logarithmic word rank. The logarithm of the rank is used so that words with high ranks (that is, rare words) are penalized only moderately.
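Written out, the measure can be sketched as follows. The symbols are introduced here for illustration only: $O_S$ is taken to be the set of word occurrences in source $S$ whose word number $r(w)$ exceeds 100, matching the filter and the offset used in the query below. The sketch assumes that `w_id` serves as the frequency rank and that `log` is the natural logarithm, as it is for the one-argument `LOG` in MySQL.

```latex
\overline{\ell}(S) \;=\; \frac{1}{|O_S|} \sum_{w \in O_S} \ln\bigl(r(w) - 100\bigr)
```

Under these assumptions, a word with $r(w) = 1{,}100$ contributes $\ln 1000 \approx 6.91$ and a word with $r(w) = 100{,}100$ contributes $\ln 100000 \approx 11.51$, so a hundredfold increase in the shifted rank adds only about 4.6 to the term being averaged.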
Query for the first table (the 30 sources with the lowest average logarithmic rank, among sources with more than 10 sentences):
select source, count(distinct i_s.s_id) as cnt_s, round(avg(log(w.w_id-100)),2) as av
from sources so, inv_so i_s, inv_w i, words w
where so.so_id=i_s.so_id and i_s.s_id=i.s_id and i.w_id=w.w_id and w.w_id>100
group by source
having cnt_s>10
order by av
LIMIT 30;
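The aggregation performed by the query can be illustrated outside the database. The following is a minimal Python sketch, not the original tooling: the function names and the flat `(source, sentence_id, word_rank)` input are hypothetical stand-ins for the joined tables, and `math.log` is assumed to correspond to the database's `log` (natural logarithm).

```python
import math
from collections import defaultdict

RANK_OFFSET = 100  # the query skips w_id <= 100 and shifts the remaining ranks by 100


def average_log_rank(word_ranks, offset=RANK_OFFSET):
    """Mean of ln(rank - offset) over ranks above the offset, rounded to 2 places.

    Returns None when no rank exceeds the offset.
    """
    logs = [math.log(r - offset) for r in word_ranks if r > offset]
    if not logs:
        return None
    return round(sum(logs) / len(logs), 2)


def rank_sources(occurrences, min_sentences=10, limit=30):
    """Group occurrences by source and list the sources with the lowest averages.

    occurrences: iterable of (source, sentence_id, word_rank) tuples,
    a hypothetical flattened form of the joins in the query.
    """
    sentences = defaultdict(set)
    ranks = defaultdict(list)
    for source, sentence_id, word_rank in occurrences:
        if word_rank > RANK_OFFSET:  # same row filter as the WHERE clause
            sentences[source].add(sentence_id)
            ranks[source].append(word_rank)

    rows = [
        (source, len(sentences[source]), average_log_rank(ranks[source]))
        for source in ranks
        if len(sentences[source]) > min_sentences  # HAVING cnt_s > 10
    ]
    rows.sort(key=lambda row: row[2])  # ORDER BY av (ascending)
    return rows[:limit]  # LIMIT 30
```

Sorting in descending order instead (`reverse=True`) would list the sources with the highest averages, which is how the second table appears to be ordered.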